Time alignment for scenario and sounds with voice, music and BGM

Authors

  • Yamato Wada
  • Masahide Sugiyama
Abstract

This paper proposes a new method for time alignment between a scenario and sounds containing voice, music, and background music (BGM), with the aim of generating video captions automatically. The proposed method, the Voice-Music-Pause+BGM method, is based on the composition of voice and music models. Evaluation experiments show that the proposed method performs about 10 to 60 times better than conventional time alignment methods.

Similar resources

Interpreting the Elements of Music and Architecture through a Common Language: The Case of Traditional Iranian Music and Architecture

Every nation has its own ideals, which are objectified through culture. In this process of transformation, the architecture and music that come from the culture of each region play a fundamental role. Architecture and music have many similarities in conceptual direction, space, shape, and morphology. They also have similarities in basic and fundamental principles such as proportions, rhy...

Language, Music, and Brain

Introduction: Over the last centuries, scientists have been trying to figure out how the brain learns language. Until 1980, the study of brain-language relationships was based on the study of human brain damage. Since 1980, however, neuroscience methods have greatly improved. There is controversy about where music, composition, or the perception of language and music are located in the brain, or wheth...

An Analysis of Achievement of the Philosophical Sense of “Extension” in Music, with Interpretation of Ibn-e Sina’s Explanation an Extension

This research can be considered as one of the studies that seek to explore, in an argumentative way, subtle and solid philosophical concepts in the field of art. The paper provides an analysis of the concept of “extension” in music as one of the most thought-provoking philosophical concepts. The analysis is carried out by interpreting Ibn-Sina’s special conception of musical extension to answer...

Designing a New Non-Parallel Training Method for Voice Conversion That Outperforms Parallel Training

Introduction: Voice mimicking by computers has been one of the most challenging topics of speech processing in recent years. A voice conversion system has two sides. On one side is the source speaker, whose voice is changed to mimic the voice of the target speaker on the other side. Two methods of p...

A Study on Bag of Gaussian Model with Application to Voice Conversion

GMM-based mapping techniques have proved to be an efficient way to find a nonlinear regression function between two spaces and have found success in voice conversion. In these methods, a linear transformation is estimated for each Gaussian component, and the final conversion function is a weighted summation of all linear transformations. These linear transformations fit well for the samples near to...

Publication date: 2003